Parallel solver for shifted systems in a hybrid CPU-GPU framework

Abstract

This paper proposes a combination of a hybrid CPU--GPU and a pure GPU software implementation of a direct algorithm for solving shifted linear systems $(A - \sigma I)X = B$ with a large number of complex shifts $\sigma$ and multiple right-hand sides. Such problems often appear, e.g., in control theory when evaluating the transfer function, or as part of an algorithm performing interpolatory model reduction, as well as when computing pseudospectra and structured pseudospectra, or solving large linear systems of ordinary differential equations. The proposed algorithm first jointly reduces the general full $n\times n$ matrix $A$ and the $n\times m$ full right-hand side matrix $B$ to the controller Hessenberg canonical form, which facilitates efficient solution: $A$ is transformed to a so-called $m$-Hessenberg form and $B$ is made upper triangular. This is implemented as a blocked, highly parallel CPU--GPU hybrid algorithm; individual blocks are reduced by the CPU, and the necessary updates of the rest of the matrix are split among the cores of the CPU and the GPU. To enhance parallelization, the reduction and the updates are overlapped. In the next phase, the reduced $m$-Hessenberg--triangular systems are solved entirely on the GPU, with shifts divided into batches. The benefits of such load distribution are demonstrated by numerical experiments. In particular, we show that our proposed implementation provides an excellent basis for efficient implementations of computational methods in systems and control theory, from evaluation of the transfer function to interpolatory model reduction.
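
The two phases described in the abstract can be illustrated with a minimal, unblocked, CPU-only NumPy sketch. It is not the paper's blocked hybrid CPU--GPU implementation. It assumes real $A$ and $B$ with $m < n$; the function names `controller_hessenberg` and `solve_shifted` are hypothetical; and the per-shift solve uses a generic dense solver rather than a method specialized to the $m$-Hessenberg structure. The point it shows is that the reduction is computed once and then reused for every shift, since $Q^T (A - \sigma I) Q = H - \sigma I$.

```python
import numpy as np

def controller_hessenberg(A, B):
    """Return Q, H, R with Q.T @ A @ Q = H (m-Hessenberg) and
    Q.T @ B = R (upper triangular). Assumes real A, B and m < n."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    n, m = B.shape
    # Step 1: a full QR factorization of B makes Q.T @ B upper triangular.
    Q, R = np.linalg.qr(B, mode="complete")
    H = Q.T @ A @ Q
    # Step 2: Householder reflectors zero column k below the m-th subdiagonal.
    # Each reflector touches only rows/columns k+m and beyond, so the
    # triangular R (nonzero only in its first m rows) is left unchanged.
    for k in range(n - m - 1):
        v = H[k + m:, k].copy()
        alpha = np.linalg.norm(v)
        if alpha == 0.0:
            continue  # column is already reduced
        v[0] += np.copysign(alpha, v[0])
        v /= np.linalg.norm(v)
        H[k + m:, :] -= 2.0 * np.outer(v, v @ H[k + m:, :])  # P @ H
        H[:, k + m:] -= 2.0 * np.outer(H[:, k + m:] @ v, v)  # H @ P
        Q[:, k + m:] -= 2.0 * np.outer(Q[:, k + m:] @ v, v)  # accumulate Q @ P
    return Q, H, R

def solve_shifted(Q, H, R, shifts):
    """Solve (A - s I) X = B for each shift s, reusing the reduction.
    Generic dense solve per shift (an assumption of this sketch)."""
    identity = np.eye(H.shape[0])
    return [Q @ np.linalg.solve(H - s * identity, R) for s in shifts]

# Small demonstration on random data (illustrative only).
rng = np.random.default_rng(0)
n, m = 8, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
shifts = [0.5 + 1.0j, -2.0 + 0.3j]

Q, H, R = controller_hessenberg(A, B)
for s, X in zip(shifts, solve_shifted(Q, H, R, shifts)):
    residual = np.linalg.norm((A - s * np.eye(n)) @ X - B)
    print(f"shift {s}: residual norm {residual:.2e}")
```

In this sketch the shifts are processed one at a time in a list comprehension; the batching of shifts on the GPU mentioned in the abstract would replace that loop.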